Identify creative dishes: Sushi Sandwiches

Deep Learning assignment with publicly available data

Ángel Martínez-Tenor

September 2018 (Last Updated in May 2021)


Description

Goal: Given their pictures, identify samples that could be considered a combination of two dishes

Input: Two separate folders, one per class, containing the pictures. The example provided here uses a dataset with 402 pictures of sandwiches and 402 pictures of sushi. Link

Only the best model obtained is shown here: MobileNet with input size (224,224), pretrained on ImageNet, with a small fully connected classifier trained and tuned on this data.

This implementation is largely influenced by, and reuses code from, the following sources:

Setup

Download, extract & split the pictures (train, validation)
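
The original code for this step is not reproduced here. As a minimal sketch of the split only, assuming the pictures have already been downloaded and extracted into one folder per class, the function below copies each class into `train` and `validation` subfolders. The function name, the 20% validation fraction and the default seed value are illustrative assumptions; only the existence of a 'SEED' parameter comes from this document.

```python
import random
import shutil
from pathlib import Path

SEED = 0  # assumed default; changing it produces a different validation set
VALIDATION_FRACTION = 0.2  # assumed value, not taken from the original script


def split_class_folder(src_dir, dst_dir, class_name,
                       val_fraction=VALIDATION_FRACTION, seed=SEED):
    """Copy the pictures of one class into dst_dir/{train,validation}/class_name.

    Returns the number of pictures copied into each split.
    """
    # Sort first so the shuffle depends only on the seed, not on OS listing order.
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)

    n_val = int(len(files) * val_fraction)
    splits = {"validation": files[:n_val], "train": files[n_val:]}

    for split_name, split_files in splits.items():
        target = Path(dst_dir) / split_name / class_name
        target.mkdir(parents=True, exist_ok=True)
        for picture in split_files:
            shutil.copy2(picture, target / picture.name)

    return {name: len(split_files) for name, split_files in splits.items()}
```

Calling it once per class, for example `split_class_folder("sushi", "data", "sushi")` and `split_class_folder("sandwich", "data", "sandwich")` with hypothetical folder names, yields the directory layout that Keras-style image generators typically expect.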

Explore the data

ML Modelling

Create image generators with data augmentation

Use a pretrained convolutional model to extract the bottleneck features

Build and train the top classifier

Build the complete trained model
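
The original code for this step is not reproduced here. The following is a minimal sketch, assuming TensorFlow/Keras, of how a frozen MobileNet base and a small fully connected classifier with a sigmoid output could be stacked into a single model. The hidden layer size (256), the dropout rate (0.5) and the optimizer are assumptions for illustration, not the tuned values used for the reported results.

```python
import tensorflow as tf

INPUT_SHAPE = (224, 224, 3)  # input size stated in the Description


def build_model(weights="imagenet", hidden_units=256, dropout=0.5):
    """Frozen MobileNet feature extractor plus a small binary classifier.

    hidden_units and dropout are illustrative assumptions.
    """
    # include_top=False drops the ImageNet classifier; pooling="avg" returns
    # one bottleneck feature vector per image.
    base = tf.keras.applications.MobileNet(
        input_shape=INPUT_SHAPE, include_top=False,
        weights=weights, pooling="avg")
    base.trainable = False  # transfer learning: only the top classifier is trained

    # Inputs are expected to be scaled already, e.g. with
    # tf.keras.applications.mobilenet.preprocess_input.
    inputs = tf.keras.Input(shape=INPUT_SHAPE)
    features = base(inputs, training=False)
    x = tf.keras.layers.Dense(hidden_units, activation="relu")(features)
    x = tf.keras.layers.Dropout(dropout)(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Building the model end to end like this is a simplification of the two-stage procedure outlined above (extract bottleneck features, then train the top classifier on them); with a frozen base the two lead to the same architecture, although caching bottleneck features trains faster.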

Make predictions and identify potential dishes

Potential Dishes = pictures misclassified or with output (sigmoid) $\in$ (0.45, 0.55). Only the validation set is used here, to avoid samples already seen during training.

Analysis of results & Future work
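
As a minimal sketch of this selection rule, the function below returns the indices of the validation pictures that are either misclassified or whose sigmoid output falls inside the open interval (0.45, 0.55). The function name, the 0/1 label encoding and the 0.5 decision threshold are assumptions made for illustration.

```python
def find_potential_dishes(probabilities, labels, low=0.45, high=0.55,
                          threshold=0.5):
    """Return indices of samples that are misclassified or uncertain.

    probabilities: flat sequence of sigmoid outputs, one per picture
                   (flatten a Keras output of shape (N, 1) first).
    labels:        true classes encoded as 0 or 1 (assumed encoding).
    """
    selected = []
    for index, (p, y) in enumerate(zip(probabilities, labels)):
        predicted = 1 if p >= threshold else 0
        misclassified = predicted != int(y)
        uncertain = low < p < high  # open interval, as in the definition above
        if misclassified or uncertain:
            selected.append(index)
    return selected


# Toy example with made-up outputs:
example = find_potential_dishes([0.02, 0.48, 0.97, 0.30, 0.53],
                                [0, 0, 1, 1, 1])
print(example)  # [1, 3, 4]
```

In the toy example, index 3 is selected because it is misclassified, while indices 1 and 4 are correctly classified but fall inside the uncertainty band.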

The best model obtained, based on transfer learning with a pretrained MobileNet, achieved accuracies between 89% and 92% on the validation set. Smaller custom convolutional models without transfer learning achieved less than 80% accuracy.

The augmentation settings of the image generator used to train the classifier rely on the fact that the dishes are usually centered and photographed from different angles.

The identified potential dishes include both plausible combinations and pictures that show no combination at all. New potential dishes can be obtained by changing the 'SEED' parameter in the main script, which produces a different validation set.

Better classifier accuracy can be obtained by training with a larger dataset or by fine-tuning the top layers of the pretrained MobileNet network. However, it is likely that the identification of potential dishes would not improve.

Alternative advanced methods could include Style Transfer or Generative Adversarial Networks for combining data, such as RemixNet.